Overview

Dataset statistics

Number of variables: 13
Number of observations: 361037
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 35.8 MiB
Average record size in memory: 104.0 B
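
The two memory figures above are consistent with each other. The sketch below checks this under one assumption that the report does not state: every column costs 8 bytes per row (a 64-bit number, or a pointer for string columns). The function names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
N_VARIABLES = 13
N_OBSERVATIONS = 361037

# Assumption (not stated in the report): 8 bytes per cell.
BYTES_PER_CELL = 8


def record_size_bytes(n_variables: int, bytes_per_cell: int = BYTES_PER_CELL) -> int:
    """Bytes used by one row when every column costs bytes_per_cell."""
    return n_variables * bytes_per_cell


def total_size_mib(n_rows: int, record_bytes: float) -> float:
    """Total size in MiB (1 MiB = 2**20 bytes) for n_rows rows."""
    return n_rows * record_bytes / 2**20


record = record_size_bytes(N_VARIABLES)
print(record, "B per record")
print(round(total_size_mib(N_OBSERVATIONS, record), 1), "MiB in total")
```

Under that assumption the per-variable "Memory size" of 2.8 MiB reported further down is simply 361037 × 8 bytes.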

Variable types

Numeric: 8
Categorical: 5

Alerts

imp_hash has constant value "25c7ac00c91884fd2923a489ae9dfbca" (Constant)
filename has a high cardinality: 733 distinct values (High cardinality)
sha256 has a high cardinality: 10251 distinct values (High cardinality)
sec_md5 has a high cardinality: 9182 distinct values (High cardinality)
sec_name has a high cardinality: 15918 distinct values (High cardinality)
df_index is highly correlated with Unnamed: 0 and 1 other field (High correlation)
Unnamed: 0 is highly correlated with df_index and 1 other field (High correlation)
win_count is highly correlated with df_index and 1 other field (High correlation)
sec_entropy is highly correlated with virtual_address (High correlation)
raw_size is highly correlated with virtual_size (High correlation)
virtual_size is highly correlated with raw_size and 1 other field (High correlation)
virtual_address is highly correlated with sec_entropy and 1 other field (High correlation)
sec_chi2 is highly correlated with raw_size and 1 other field (High correlation)
sec_entropy is highly correlated with raw_size and 2 other fields (High correlation)
raw_size is highly correlated with sec_chi2 and 2 other fields (High correlation)
virtual_size is highly correlated with sec_chi2 and 2 other fields (High correlation)
virtual_address is highly correlated with sec_entropy (High correlation)
virtual_size is highly correlated with raw_size (High correlation)
sec_chi2 is highly correlated with raw_size and 2 other fields (High correlation)
raw_size is highly correlated with sec_chi2 and 3 other fields (High correlation)
virtual_size is highly correlated with sec_chi2 and 3 other fields (High correlation)
virtual_address is highly correlated with sec_chi2 and 3 other fields (High correlation)
df_index has unique values (Unique)
Unnamed: 0 has unique values (Unique)
sec_entropy has 259677 (71.9%) zeros (Zeros)

Reproduction

Analysis started: 2022-08-22 03:18:26.317599
Analysis finished: 2022-08-22 03:18:41.522832
Duration: 15.21 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
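
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the function name is illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
STARTED = "2022-08-22 03:18:26.317599"
FINISHED = "2022-08-22 03:18:41.522832"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed seconds between two 'YYYY-MM-DD HH:MM:SS.ffffff' timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    delta = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds()


print(round(duration_seconds(STARTED, FINISHED), 2), "seconds")
```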

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 361037
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4440196.256
Minimum: 890523
Maximum: 5674752
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 890523
5-th percentile: 3535314.8
Q1: 4263319
Median: 4514605
Q3: 4776000
95-th percentile: 5097902.2
Maximum: 5674752
Range: 4784229
Interquartile range (IQR): 512681
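
Range and IQR are derived from the other quantile figures. The sketch below recomputes them and, as an extra step that is not part of the report, applies the common 1.5 × IQR rule (an assumed convention) to show that both the minimum and the maximum lie outside the fences.

```python
# Quantile figures for df_index, copied from the table above.
MINIMUM = 890523
Q1 = 4263319
Q3 = 4776000
MAXIMUM = 5674752


def value_range(minimum: float, maximum: float) -> float:
    """Range = maximum - minimum."""
    return maximum - minimum


def interquartile_range(q1: float, q3: float) -> float:
    """IQR = Q3 - Q1."""
    return q3 - q1


def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple:
    """Lower and upper outlier fences; k = 1.5 is a convention, not a report setting."""
    iqr = interquartile_range(q1, q3)
    return (q1 - k * iqr, q3 + k * iqr)


low, high = tukey_fences(Q1, Q3)
print(value_range(MINIMUM, MAXIMUM), interquartile_range(Q1, Q3))
print(MINIMUM < low, MAXIMUM > high)
```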

Descriptive statistics

Standard deviation: 622718.1196
Coefficient of variation (CV): 0.1402456296
Kurtosis: 11.46646499
Mean: 4440196.256
Median Absolute Deviation (MAD): 254991
Skewness: -2.921966442
Sum: 1.603075136 × 10^12
Variance: 3.877778565 × 10^11
Monotonicity: Strictly increasing
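
Several of the descriptive statistics are functions of one another: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A small check against the printed figures (tolerances allow for the rounding in the report; function names are illustrative):

```python
import math

# Figures for df_index, copied from the report.
N = 361037
MEAN = 4440196.256
STD = 622718.1196


def coefficient_of_variation(std: float, mean: float) -> float:
    """CV = standard deviation divided by the mean."""
    return std / mean


def variance_from_std(std: float) -> float:
    """Variance is the square of the standard deviation."""
    return std ** 2


def total_from_mean(mean: float, n: int) -> float:
    """Sum of all values = mean * number of observations."""
    return mean * n


print(coefficient_of_variation(STD, MEAN))
print(variance_from_std(STD))
print(total_from_mean(MEAN, N))
```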
Histogram with fixed size bins (bins=50)
Common values

Value                    Count     Frequency (%)
890523                   1         < 0.1%
4683030                  1         < 0.1%
4682908                  1         < 0.1%
4682907                  1         < 0.1%
4682906                  1         < 0.1%
4682887                  1         < 0.1%
4682886                  1         < 0.1%
4682885                  1         < 0.1%
4682884                  1         < 0.1%
4682883                  1         < 0.1%
Other values (361027)    361027    > 99.9%

Minimum 10 values

Value      Count    Frequency (%)
890523     1        < 0.1%
890524     1        < 0.1%
890525     1        < 0.1%
890526     1        < 0.1%
890527     1        < 0.1%
890528     1        < 0.1%
890529     1        < 0.1%
890530     1        < 0.1%
890531     1        < 0.1%
890532     1        < 0.1%

Maximum 10 values

Value      Count    Frequency (%)
5674752    1        < 0.1%
5674751    1        < 0.1%
5674750    1        < 0.1%
5674749    1        < 0.1%
5674748    1        < 0.1%
5674747    1        < 0.1%
5674746    1        < 0.1%
5674745    1        < 0.1%
5674744    1        < 0.1%
5674743    1        < 0.1%

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 361037
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4440196.256
Minimum: 890523
Maximum: 5674752
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 890523
5-th percentile: 3535314.8
Q1: 4263319
Median: 4514605
Q3: 4776000
95-th percentile: 5097902.2
Maximum: 5674752
Range: 4784229
Interquartile range (IQR): 512681

Descriptive statistics

Standard deviation: 622718.1196
Coefficient of variation (CV): 0.1402456296
Kurtosis: 11.46646499
Mean: 4440196.256
Median Absolute Deviation (MAD): 254991
Skewness: -2.921966442
Sum: 1.603075136 × 10^12
Variance: 3.877778565 × 10^11
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value                    Count     Frequency (%)
890523                   1         < 0.1%
4683030                  1         < 0.1%
4682908                  1         < 0.1%
4682907                  1         < 0.1%
4682906                  1         < 0.1%
4682887                  1         < 0.1%
4682886                  1         < 0.1%
4682885                  1         < 0.1%
4682884                  1         < 0.1%
4682883                  1         < 0.1%
Other values (361027)    361027    > 99.9%

Minimum 10 values

Value      Count    Frequency (%)
890523     1        < 0.1%
890524     1        < 0.1%
890525     1        < 0.1%
890526     1        < 0.1%
890527     1        < 0.1%
890528     1        < 0.1%
890529     1        < 0.1%
890530     1        < 0.1%
890531     1        < 0.1%
890532     1        < 0.1%

Maximum 10 values

Value      Count    Frequency (%)
5674752    1        < 0.1%
5674751    1        < 0.1%
5674750    1        < 0.1%
5674749    1        < 0.1%
5674748    1        < 0.1%
5674747    1        < 0.1%
5674746    1        < 0.1%
5674745    1        < 0.1%
5674744    1        < 0.1%
5674743    1        < 0.1%

filename
Categorical

HIGH CARDINALITY

Distinct: 733
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value                       Count
2022041921/2022041921_40    2552
2022041921/2022041921_5     2237
2022041921/2022041921_7     2191
2022041921/2022041921_39    2122
2022041920/2022041920_45    2095
Other values (728)          349840

Length

Max length: 24
Median length: 24
Mean length: 23.83824096
Min length: 23

Characters and Unicode

Total characters: 8606487
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
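
As an illustration of those character properties, the sketch below counts Unicode general categories for one filename value with Python's `unicodedata` module. The two-letter codes correspond to the category names used in this report: `Nd` is Decimal Number, `Po` is Other Punctuation, `Pc` is Connector Punctuation. The helper name is illustrative, and this is not necessarily how the report itself computes the counts.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of the filename column.
sample = "2022041900/2022041900_10"
counts = category_counts(sample)
print(dict(counts))
```

One `/` and one `_` per value also explains why both punctuation categories total exactly 361037, the number of rows.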

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022041900/2022041900_10
2nd row: 2022041900/2022041900_10
3rd row: 2022041900/2022041900_10
4th row: 2022041900/2022041900_10
5th row: 2022041900/2022041900_10
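
Every filename shown in this report has the shape `<10 digits>/<10 digits>_<number>`, which matches the 12 distinct characters (ten digits, `/`, `_`) and the lengths of 23 or 24. A hedged parsing sketch follows; the pattern and the field names `directory`, `prefix` and `index` are assumptions inferred from the samples, and the meaning of the 10-digit part is not stated anywhere in the report.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred from the sample values only.
FILENAME_RE = re.compile(r"(\d{10})/(\d{10})_(\d+)")


class FilenameParts(NamedTuple):
    directory: str  # part before the slash
    prefix: str     # part between the slash and the underscore
    index: int      # trailing number after the underscore


def parse_filename(value: str) -> FilenameParts:
    """Split a filename value into its three parts, or raise ValueError."""
    match = FILENAME_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected filename format: {value!r}")
    return FilenameParts(match.group(1), match.group(2), int(match.group(3)))


print(parse_filename("2022041900/2022041900_10"))
```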

Common Values

Value                       Count     Frequency (%)
2022041921/2022041921_40    2552      0.7%
2022041921/2022041921_5     2237      0.6%
2022041921/2022041921_7     2191      0.6%
2022041921/2022041921_39    2122      0.6%
2022041920/2022041920_45    2095      0.6%
2022041919/2022041919_40    2089      0.6%
2022041919/2022041919_1     2036      0.6%
2022041920/2022041920_55    1983      0.5%
2022041920/2022041920_49    1970      0.5%
2022041919/2022041919_42    1960      0.5%
Other values (723)          339802    94.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2022041921/2022041921_402552
 
0.7%
2022041921/2022041921_52237
 
0.6%
2022041921/2022041921_72191
 
0.6%
2022041921/2022041921_392122
 
0.6%
2022041920/2022041920_452095
 
0.6%
2022041919/2022041919_402089
 
0.6%
2022041919/2022041919_12036
 
0.6%
2022041920/2022041920_551983
 
0.5%
2022041920/2022041920_491970
 
0.5%
2022041919/2022041919_421960
 
0.5%
Other values (723)339802
94.1%

Most occurring characters

ValueCountFrequency (%)
22740391
31.8%
01686041
19.6%
11250600
14.5%
9921679
 
10.7%
4838591
 
9.7%
/361037
 
4.2%
_361037
 
4.2%
3137618
 
1.6%
8131154
 
1.5%
599506
 
1.2%
Other values (2)78833
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number7884413
91.6%
Other Punctuation361037
 
4.2%
Connector Punctuation361037
 
4.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
22740391
34.8%
01686041
21.4%
11250600
15.9%
9921679
 
11.7%
4838591
 
10.6%
3137618
 
1.7%
8131154
 
1.7%
599506
 
1.3%
741268
 
0.5%
637565
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/361037
100.0%
Connector Punctuation
ValueCountFrequency (%)
_361037
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common8606487
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
22740391
31.8%
01686041
19.6%
11250600
14.5%
9921679
 
10.7%
4838591
 
9.7%
/361037
 
4.2%
_361037
 
4.2%
3137618
 
1.6%
8131154
 
1.5%
599506
 
1.2%
Other values (2)78833
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII8606487
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
22740391
31.8%
01686041
19.6%
11250600
14.5%
9921679
 
10.7%
4838591
 
9.7%
/361037
 
4.2%
_361037
 
4.2%
3137618
 
1.6%
8131154
 
1.5%
599506
 
1.2%
Other values (2)78833
 
0.9%

win_count
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10284
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 415397.3033
Minimum: 1371
Maximum: 615253
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 1371
5-th percentile: 348689
Q1: 383732
Median: 419650
Q3: 454941
95-th percentile: 506850
Maximum: 615253
Range: 613882
Interquartile range (IQR): 71209

Descriptive statistics

Standard deviation: 68022.63825
Coefficient of variation (CV): 0.1637532013
Kurtosis: 8.514812051
Mean: 415397.3033
Median Absolute Deviation (MAD): 35637
Skewness: -1.927899435
Sum: 1.499737962 × 10^11
Variance: 4627079314
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
42372297
 
< 0.1%
39517796
 
< 0.1%
40143684
 
< 0.1%
39899683
 
< 0.1%
38123280
 
< 0.1%
36356377
 
< 0.1%
40156472
 
< 0.1%
38701367
 
< 0.1%
36954362
 
< 0.1%
36320760
 
< 0.1%
Other values (10274)360259
99.8%
ValueCountFrequency (%)
137150
< 0.1%
345042
< 0.1%
373350
< 0.1%
495450
< 0.1%
502230
< 0.1%
508049
< 0.1%
540844
< 0.1%
541931
< 0.1%
573134
< 0.1%
620849
< 0.1%
ValueCountFrequency (%)
61525348
< 0.1%
61506332
< 0.1%
61473132
< 0.1%
61254450
< 0.1%
61249419
 
< 0.1%
61227744
< 0.1%
61023626
< 0.1%
60995622
< 0.1%
60972126
< 0.1%
60921723
< 0.1%

sha256
Categorical

HIGH CARDINALITY

Distinct: 10251
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value                                                               Count
04d66260ccaca00fdb56608492b754a4cdd6d7be24a48c6463c0c9bf13ecfd81    100
bf05399e1c937d3dfb38be3600967b703d706cbe34dd130e7e9b8c90044a4754    100
96ac2c6669518e0a3ad43796b1511ee5726fdd86bcc14714d9576be327b97ea1    100
db1d37343dc606670df06f44162a63ca43d00552a790969b2393db3b3da048f0    100
9bd5b3351e19f00d6dee37fdd922e886a1c64712f59b20e5d729f87ce15c39a1    98
Other values (10246)                                                360539

Length

Max length: 64
Median length: 64
Mean length: 64
Min length: 64

Characters and Unicode

Total characters: 23106368
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4
2nd row: 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4
3rd row: 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4
4th row: 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4
5th row: 421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4
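
A fixed length of 64 and 16 distinct characters are what lowercase hexadecimal SHA-256 digests look like (256 bits at 4 bits per character), and 361037 × 64 gives the 23106368 total characters reported. A minimal validation sketch, with an illustrative helper name:

```python
import hashlib
import re


def is_hex_digest(value: str, n_chars: int) -> bool:
    """True if value consists of exactly n_chars lowercase hexadecimal characters."""
    return re.fullmatch(r"[0-9a-f]{%d}" % n_chars, value) is not None


# First sample row of the sha256 column.
sample = "421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c4"
print(is_hex_digest(sample, 64))

# Any SHA-256 hex digest has the same length, whatever the input.
print(len(hashlib.sha256(b"").hexdigest()))
```

The same helper with `n_chars=32` fits the 32-character imp_hash and sec_md5 columns.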

Common Values

Value                                                               Count     Frequency (%)
04d66260ccaca00fdb56608492b754a4cdd6d7be24a48c6463c0c9bf13ecfd81    100       < 0.1%
bf05399e1c937d3dfb38be3600967b703d706cbe34dd130e7e9b8c90044a4754    100       < 0.1%
96ac2c6669518e0a3ad43796b1511ee5726fdd86bcc14714d9576be327b97ea1    100       < 0.1%
db1d37343dc606670df06f44162a63ca43d00552a790969b2393db3b3da048f0    100       < 0.1%
9bd5b3351e19f00d6dee37fdd922e886a1c64712f59b20e5d729f87ce15c39a1    98        < 0.1%
138cf191475aa7d3c71ffd4f817b101e615618569389d094eebdd76f77f408a8    96        < 0.1%
e26b5b9a05f172b9d6c14409fb7f63c1344c21df0fbc56191b18061c10cacbf1    94        < 0.1%
bfba2a465cd51dd5d1bbcbaed0079c1cbbe66ef20cc60bf3aa624bdfab5f0ec8    92        < 0.1%
e8553934404d9f5bff496180b6a10e7931962f04743fd3674534f568c13cc843    90        < 0.1%
c017751a5a0af4a61f214148b092aa1a96716a84361b414cace0edbc2c7d6c7c    90        < 0.1%
Other values (10241)                                                360077    99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
04d66260ccaca00fdb56608492b754a4cdd6d7be24a48c6463c0c9bf13ecfd81100
 
< 0.1%
96ac2c6669518e0a3ad43796b1511ee5726fdd86bcc14714d9576be327b97ea1100
 
< 0.1%
db1d37343dc606670df06f44162a63ca43d00552a790969b2393db3b3da048f0100
 
< 0.1%
bf05399e1c937d3dfb38be3600967b703d706cbe34dd130e7e9b8c90044a4754100
 
< 0.1%
9bd5b3351e19f00d6dee37fdd922e886a1c64712f59b20e5d729f87ce15c39a198
 
< 0.1%
138cf191475aa7d3c71ffd4f817b101e615618569389d094eebdd76f77f408a896
 
< 0.1%
e26b5b9a05f172b9d6c14409fb7f63c1344c21df0fbc56191b18061c10cacbf194
 
< 0.1%
bfba2a465cd51dd5d1bbcbaed0079c1cbbe66ef20cc60bf3aa624bdfab5f0ec892
 
< 0.1%
e8553934404d9f5bff496180b6a10e7931962f04743fd3674534f568c13cc84390
 
< 0.1%
c017751a5a0af4a61f214148b092aa1a96716a84361b414cace0edbc2c7d6c7c90
 
< 0.1%
Other values (10241)360077
99.7%

Most occurring characters

ValueCountFrequency (%)
d1457692
 
6.3%
81451048
 
6.3%
71450456
 
6.3%
11449136
 
6.3%
01449006
 
6.3%
31445882
 
6.3%
a1445840
 
6.3%
c1444017
 
6.2%
91443498
 
6.2%
61441941
 
6.2%
Other values (6)8627852
37.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number14449882
62.5%
Lowercase Letter8656486
37.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
81451048
10.0%
71450456
10.0%
11449136
10.0%
01449006
10.0%
31445882
10.0%
91443498
10.0%
61441941
10.0%
51441911
10.0%
21439061
10.0%
41437943
10.0%
Lowercase Letter
ValueCountFrequency (%)
d1457692
16.8%
a1445840
16.7%
c1444017
16.7%
f1441070
16.6%
e1436031
16.6%
b1431836
16.5%

Most occurring scripts

ValueCountFrequency (%)
Common14449882
62.5%
Latin8656486
37.5%

Most frequent character per script

Common
ValueCountFrequency (%)
81451048
10.0%
71450456
10.0%
11449136
10.0%
01449006
10.0%
31445882
10.0%
91443498
10.0%
61441941
10.0%
51441911
10.0%
21439061
10.0%
41437943
10.0%
Latin
ValueCountFrequency (%)
d1457692
16.8%
a1445840
16.7%
c1444017
16.7%
f1441070
16.6%
e1436031
16.6%
b1431836
16.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII23106368
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d1457692
 
6.3%
81451048
 
6.3%
71450456
 
6.3%
11449136
 
6.3%
01449006
 
6.3%
31445882
 
6.3%
a1445840
 
6.3%
c1444017
 
6.2%
91443498
 
6.2%
61441941
 
6.2%
Other values (6)8627852
37.3%

imp_hash
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value                               Count
25c7ac00c91884fd2923a489ae9dfbca    361037

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 11553184
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 25c7ac00c91884fd2923a489ae9dfbca
2nd row: 25c7ac00c91884fd2923a489ae9dfbca
3rd row: 25c7ac00c91884fd2923a489ae9dfbca
4th row: 25c7ac00c91884fd2923a489ae9dfbca
5th row: 25c7ac00c91884fd2923a489ae9dfbca

Common Values

Value                               Count     Frequency (%)
25c7ac00c91884fd2923a489ae9dfbca    361037    100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
25c7ac00c91884fd2923a489ae9dfbca361037
100.0%

Most occurring characters

ValueCountFrequency (%)
c1444148
12.5%
a1444148
12.5%
91444148
12.5%
21083111
9.4%
81083111
9.4%
0722074
 
6.2%
4722074
 
6.2%
f722074
 
6.2%
d722074
 
6.2%
5361037
 
3.1%
Other values (5)1805185
15.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number6498666
56.2%
Lowercase Letter5054518
43.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
91444148
22.2%
21083111
16.7%
81083111
16.7%
0722074
11.1%
4722074
11.1%
5361037
 
5.6%
7361037
 
5.6%
1361037
 
5.6%
3361037
 
5.6%
Lowercase Letter
ValueCountFrequency (%)
c1444148
28.6%
a1444148
28.6%
f722074
14.3%
d722074
14.3%
e361037
 
7.1%
b361037
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common6498666
56.2%
Latin5054518
43.8%

Most frequent character per script

Common
ValueCountFrequency (%)
91444148
22.2%
21083111
16.7%
81083111
16.7%
0722074
11.1%
4722074
11.1%
5361037
 
5.6%
7361037
 
5.6%
1361037
 
5.6%
3361037
 
5.6%
Latin
ValueCountFrequency (%)
c1444148
28.6%
a1444148
28.6%
f722074
14.3%
d722074
14.3%
e361037
 
7.1%
b361037
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII11553184
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c1444148
12.5%
a1444148
12.5%
91444148
12.5%
21083111
9.4%
81083111
9.4%
0722074
 
6.2%
4722074
 
6.2%
f722074
 
6.2%
d722074
 
6.2%
5361037
 
3.1%
Other values (5)1805185
15.6%

sec_chi2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8562
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3896430.123
Minimum: 37912.29
Maximum: 73113600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 37912.29
5-th percentile: 65485.4
Q1: 1044480
Median: 1044480
Q3: 2088960
95-th percentile: 7311360
Maximum: 73113600
Range: 73075687.71
Interquartile range (IQR): 1044480

Descriptive statistics

Standard deviation: 13135118.62
Coefficient of variation (CV): 3.371064848
Kurtosis: 23.47635445
Mean: 3896430.123
Median Absolute Deviation (MAD): 362902
Skewness: 5.025302897
Sum: 1.406755442 × 10^12
Variance: 1.725313412 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count     Frequency (%)
1044480                157453    43.6%
2088960                83165     23.0%
73113600               12326     3.4%
498667.88              10295     2.9%
273516.75              10295     2.9%
940514.5               10295     2.9%
2810188                10295     2.9%
134540.95              10295     2.9%
65485.4                10290     2.9%
4015190.25             10290     2.9%
Other values (8552)    36038     10.0%

Minimum 10 values

Value       Count    Frequency (%)
37912.29    1        < 0.1%
37913.59    1        < 0.1%
37919.14    13       < 0.1%
37920       1        < 0.1%
37922.22    1        < 0.1%
37922.54    1        < 0.1%
37923.03    10277    2.8%
61303       1        < 0.1%
61381.88    1        < 0.1%
61682.75    1        < 0.1%

Maximum 10 values

Value         Count    Frequency (%)
73113600      12326    3.4%
54312960      245      0.1%
7311360       6488     1.8%
4015315.75    1        < 0.1%
4015190.25    10290    2.9%
4015190       1        < 0.1%
4015189.75    1        < 0.1%
4015189.25    1        < 0.1%
4015182       1        < 0.1%
2810188       10295    2.9%
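
The most frequent sec_chi2 values are exact multiples of 255: 1044480 = 255 × 4096, 2088960 = 255 × 8192 and 73113600 = 255 × 286720, where 4096, 8192 and 286720 are all raw_size values from this report. That is what a Pearson chi-square statistic against a uniform byte distribution yields for data made of a single repeated byte value. The report does not say how sec_chi2 was computed, so the sketch below is an assumption that happens to reproduce those numbers; the function name is illustrative.

```python
from collections import Counter


def chi_square_uniform(data: bytes) -> float:
    """Pearson chi-square of the byte histogram against a uniform distribution."""
    if not data:
        raise ValueError("data must not be empty")
    expected = len(data) / 256  # expected count per byte value if uniform
    counts = Counter(data)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


# A block of n identical bytes gives exactly 255 * n.
print(chi_square_uniform(bytes(4096)))
print(chi_square_uniform(bytes(8192)))

# A perfectly uniform block gives 0.
print(chi_square_uniform(bytes(range(256)) * 16))
```

If the assumption holds, the 43.6% of rows with sec_chi2 = 1044480 would be 4096-byte sections filled with one byte value, which fits the large share of zero-entropy rows.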

sec_entropy
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 406
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.210155109
Minimum: 0
Maximum: 7.87
Zeros: 259677
Zeros (%): 71.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0.54
95-th percentile: 7.82
Maximum: 7.87
Range: 7.87
Interquartile range (IQR): 0.54

Descriptive statistics

Standard deviation: 2.365520452
Coefficient of variation (CV): 1.954725005
Kurtosis: 1.816032852
Mean: 1.210155109
Median Absolute Deviation (MAD): 0
Skewness: 1.801276212
Sum: 436910.77
Variance: 5.595687007
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0259677
71.9%
0.5410318
 
2.9%
4.0910299
 
2.9%
2.7810296
 
2.9%
7.8210295
 
2.9%
4.0410295
 
2.9%
6.6710295
 
2.9%
7.8710295
 
2.9%
5.1410295
 
2.9%
0.485616
 
1.6%
Other values (396)13356
 
3.7%
ValueCountFrequency (%)
0259677
71.9%
0.234
 
< 0.1%
0.242
 
< 0.1%
0.311
 
< 0.1%
0.322
 
< 0.1%
0.3322
 
< 0.1%
0.3430
 
< 0.1%
0.362
 
< 0.1%
0.461
 
< 0.1%
0.485616
 
1.6%
ValueCountFrequency (%)
7.8710295
2.9%
7.8210295
2.9%
6.6710295
2.9%
5.75520
 
0.1%
5.74609
 
0.2%
5.7334
 
< 0.1%
5.7266
 
< 0.1%
5.711
 
< 0.1%
5.5710
 
< 0.1%
5.565
 
< 0.1%
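
The observed range of sec_entropy (0 to 7.87) fits Shannon entropy measured in bits per byte, which is bounded by 0 and 8. The report does not define the column, so the following is a sketch under that assumption, with an illustrative function name.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


# A block of identical bytes carries no information.
print(shannon_entropy(bytes(4096)))

# Every byte value equally likely reaches the 8-bit maximum.
print(shannon_entropy(bytes(range(256))))
```

Under this reading, the 71.9% of rows with entropy 0 are sections whose content is a single repeated byte value.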

sec_md5
Categorical

HIGH CARDINALITY

Distinct: 9182
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value                               Count
620f0b67a91f7f74151bc5be745b7110    157453
0829f71740aab1ab98b33eae21dee122    83165
4579108cda3cebc6432027a86e7b7a9b    12326
b60eaab7c709450be3cba1c56615936c    10295
a24d116ab001c6148d20035da2529014    10295
Other values (9177)                 87503

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Characters and Unicode

Total characters: 11553184
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9083
Unique (%): 2.5%

Sample

1st row: 072b707ef80f7c15338cf1fd7c7212aa
2nd row: 9f46fc5a3fcb244f69598b159edfecd8
3rd row: a24d116ab001c6148d20035da2529014
4th row: ca2ef02cb2d2c48858ceb3137e44019d
5th row: 218e10d0dff11714c4062e870cc733ae

Common Values

Value                               Count     Frequency (%)
620f0b67a91f7f74151bc5be745b7110    157453    43.6%
0829f71740aab1ab98b33eae21dee122    83165     23.0%
4579108cda3cebc6432027a86e7b7a9b    12326     3.4%
b60eaab7c709450be3cba1c56615936c    10295     2.9%
a24d116ab001c6148d20035da2529014    10295     2.9%
ca2ef02cb2d2c48858ceb3137e44019d    10295     2.9%
bac11b3e0f44cd6f7fdbe64b8961f1d4    10295     2.9%
0e0a578022c8fef3fb155e5713e8195c    10295     2.9%
072b707ef80f7c15338cf1fd7c7212aa    10290     2.9%
218e10d0dff11714c4062e870cc733ae    10290     2.9%
Other values (9172)                 36038     10.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
620f0b67a91f7f74151bc5be745b7110157453
43.6%
0829f71740aab1ab98b33eae21dee12283165
23.0%
4579108cda3cebc6432027a86e7b7a9b12326
 
3.4%
b60eaab7c709450be3cba1c56615936c10295
 
2.9%
a24d116ab001c6148d20035da252901410295
 
2.9%
ca2ef02cb2d2c48858ceb3137e44019d10295
 
2.9%
bac11b3e0f44cd6f7fdbe64b8961f1d410295
 
2.9%
0e0a578022c8fef3fb155e5713e8195c10295
 
2.9%
218e10d0dff11714c4062e870cc733ae10290
 
2.9%
072b707ef80f7c15338cf1fd7c7212aa10290
 
2.9%
Other values (9172)36038
 
10.0%

Most occurring characters

ValueCountFrequency (%)
11458501
12.6%
71242407
10.8%
b1101982
9.5%
0926630
 
8.0%
f802065
 
6.9%
e726248
 
6.3%
2724404
 
6.3%
a709087
 
6.1%
5690820
 
6.0%
4617735
 
5.3%
Other values (6)2553305
22.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number7448121
64.5%
Lowercase Letter4105063
35.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
11458501
19.6%
71242407
16.7%
0926630
12.4%
2724404
9.7%
5690820
9.3%
4617735
8.3%
9517054
 
6.9%
6515823
 
6.9%
8391529
 
5.3%
3363218
 
4.9%
Lowercase Letter
ValueCountFrequency (%)
b1101982
26.8%
f802065
19.5%
e726248
17.7%
a709087
17.3%
c477312
11.6%
d288369
 
7.0%

Most occurring scripts

ValueCountFrequency (%)
Common7448121
64.5%
Latin4105063
35.5%

Most frequent character per script

Common
ValueCountFrequency (%)
11458501
19.6%
71242407
16.7%
0926630
12.4%
2724404
9.7%
5690820
9.3%
4617735
8.3%
9517054
 
6.9%
6515823
 
6.9%
8391529
 
5.3%
3363218
 
4.9%
Latin
ValueCountFrequency (%)
b1101982
26.8%
f802065
19.5%
e726248
17.7%
a709087
17.3%
c477312
11.6%
d288369
 
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII11553184
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11458501
12.6%
71242407
10.8%
b1101982
9.5%
0926630
 
8.0%
f802065
 
6.9%
e726248
 
6.3%
2724404
 
6.3%
a709087
 
6.1%
5690820
 
6.0%
4617735
 
5.3%
Other values (6)2553305
22.1%

raw_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35764.22893
Minimum: 4096
Maximum: 286720
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 4096
5-th percentile: 4096
Q1: 4096
Median: 4096
Q3: 8192
95-th percentile: 270336
Maximum: 286720
Range: 282624
Interquartile range (IQR): 4096

Descriptive statistics

Standard deviation: 76828.9883
Coefficient of variation (CV): 2.148207597
Kurtosis: 4.484647564
Mean: 35764.22893
Median Absolute Deviation (MAD): 0
Skewness: 2.447244966
Sum: 1.291220992 × 10^10
Variance: 5902693443
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Common values

Value     Count     Frequency (%)
4096      204310    56.6%
8192      84556     23.4%
286720    13462     3.7%
163840    10295     2.9%
270336    10295     2.9%
192512    10295     2.9%
65536     10295     2.9%
12288     10295     2.9%
28672     6895      1.9%
212992    339       0.1%

Minimum 10 values

Value     Count     Frequency (%)
4096      204310    56.6%
8192      84556     23.4%
12288     10295     2.9%
28672     6895      1.9%
65536     10295     2.9%
163840    10295     2.9%
192512    10295     2.9%
212992    339       0.1%
270336    10295     2.9%
286720    13462     3.7%

Maximum 10 values

Value     Count     Frequency (%)
286720    13462     3.7%
270336    10295     2.9%
212992    339       0.1%
192512    10295     2.9%
163840    10295     2.9%
65536     10295     2.9%
28672     6895      1.9%
12288     10295     2.9%
8192      84556     23.4%
4096      204310    56.6%

virtual_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 364
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33554.18039
Minimum: 123
Maximum: 283074
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 123
5-th percentile: 431
Q1: 1787
Median: 3062
Q3: 7782
95-th percentile: 269833
Maximum: 283074
Range: 282951
Interquartile range (IQR): 5995

Descriptive statistics

Standard deviation: 76880.4531
Coefficient of variation (CV): 2.291233229
Kurtosis: 4.444827051
Mean: 33554.18039
Median Absolute Deviation (MAD): 2040
Skewness: 2.439407684
Sum: 1.211430063 × 10^10
Variance: 5910604069
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count     Frequency (%)
3038                  28026     7.8%
4751                  19847     5.5%
571                   16854     4.7%
2302                  16713     4.6%
4728                  16156     4.5%
1846                  16091     4.5%
7978                  13592     3.8%
282996                12265     3.4%
161694                10295     2.9%
64664                 10295     2.9%
Other values (354)    200903    55.6%

Minimum 10 values

Value    Count    Frequency (%)
123      26       < 0.1%
132      17       < 0.1%
180      880      0.2%
232      20       < 0.1%
249      663      0.2%
259      422      0.1%
262      472      0.1%
268      234      0.1%
276      135      < 0.1%
311      939      0.3%

Maximum 10 values

Value     Count    Frequency (%)
283074    1197     0.3%
282996    12265    3.4%
269833    10295    2.9%
210737    339      0.1%
190882    10295    2.9%
161694    10295    2.9%
64664     10295    2.9%
27856     6895     1.9%
10428     10295    2.9%
8071      218      0.1%
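
The largest virtual_size values line up with the raw_size values reported earlier once they are rounded up to the next multiple of 4096, and their row counts match as well (for example 282996 and 283074, with 12265 + 1197 = 13462 rows, against raw_size 286720 with 13462 rows). The report does not document this relationship; the sketch below only tests that reading, assuming a 4096-byte alignment inferred from the data. The function name is illustrative.

```python
def align_up(size: int, alignment: int = 4096) -> int:
    """Round size up to the next multiple of alignment (4096 is an assumption)."""
    return -(-size // alignment) * alignment


# (virtual_size, raw_size) pairs read off the two "Maximum 10 values" tables.
PAIRS = [
    (283074, 286720),
    (282996, 286720),
    (269833, 270336),
    (210737, 212992),
    (190882, 192512),
    (161694, 163840),
    (64664, 65536),
    (27856, 28672),
    (10428, 12288),
]

for virtual_size, raw_size in PAIRS:
    print(virtual_size, align_up(virtual_size), align_up(virtual_size) == raw_size)
```

Such a relationship would also account for the high correlation flagged between raw_size and virtual_size.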

virtual_address
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 212
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 893713.8976
Minimum: 4096
Maximum: 2068480
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 MiB

Quantile statistics

Minimum: 4096
5-th percentile: 167936
Q1: 712704
Median: 1060864
Q3: 1122304
95-th percentile: 1200128
Maximum: 2068480
Range: 2064384
Interquartile range (IQR): 409600

Descriptive statistics

Standard deviation: 362645.281
Coefficient of variation (CV): 0.4057733487
Kurtosis: 0.1088205522
Mean: 893713.8976
Median Absolute Deviation (MAD): 73728
Skewness: -1.234143403
Sum: 3.226637844 × 10^11
Variance: 1.315115998 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count     Frequency (%)
4096                  10295     2.9%
167936                10295     2.9%
172032                10295     2.9%
176128                10295     2.9%
180224                10295     2.9%
450560                10295     2.9%
454656                10295     2.9%
647168                10295     2.9%
712704                10295     2.9%
724992                10295     2.9%
Other values (202)    258087    71.5%

Minimum 10 values

Value     Count    Frequency (%)
4096      10295    2.9%
167936    10295    2.9%
172032    10295    2.9%
176128    10295    2.9%
180224    10295    2.9%
450560    10295    2.9%
454656    10295    2.9%
647168    10295    2.9%
712704    10295    2.9%
724992    10295    2.9%

Maximum 10 values

Value      Count    Frequency (%)
2068480    2        < 0.1%
2052096    2        < 0.1%
2007040    2        < 0.1%
1953792    2        < 0.1%
1826816    2        < 0.1%
1822720    2        < 0.1%
1814528    2        < 0.1%
1798144    2        < 0.1%
1794048    6        < 0.1%
1789952    2        < 0.1%

sec_name
Categorical

HIGH CARDINALITY

Distinct: 15918
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 MiB

Value                   Count
.rdata                  20590
.text                   10295
.crt1                   10295
.data                   10295
.pdata                  10295
Other values (15913)    299267

Length

Max length: 7
Median length: 6
Mean length: 5.444513997
Min length: 4

Characters and Unicode

Total characters: 1965671
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6369
Unique (%): 1.8%

Sample

1st row: .text
2nd row: .rdata
3rd row: .crt1
4th row: .rdata
5th row: .data

Common Values

Value                   Count     Frequency (%)
.rdata                  20590     5.7%
.text                   10295     2.9%
.crt1                   10295     2.9%
.data                   10295     2.9%
.pdata                  10295     2.9%
qwTG                    10295     2.9%
.rsrc                   10295     2.9%
.reloc                  10295     2.9%
.lqen                   9890      2.7%
.vqb                    9890      2.7%
Other values (15908)    248602    68.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
rdata20590
 
5.7%
qwtg10295
 
2.9%
reloc10295
 
2.9%
rsrc10295
 
2.9%
text10295
 
2.9%
pdata10295
 
2.9%
crt110295
 
2.9%
data10295
 
2.9%
lqen9890
 
2.7%
gjd9890
 
2.7%
Other values (15908)248602
68.9%

Most occurring characters

ValueCountFrequency (%)
.350742
17.8%
a120041
 
6.1%
r112921
 
5.7%
t110130
 
5.6%
q98491
 
5.0%
d82776
 
4.2%
c72871
 
3.7%
l71620
 
3.6%
e68716
 
3.5%
w59080
 
3.0%
Other values (20)818283
41.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1584044
80.6%
Other Punctuation350742
 
17.8%
Uppercase Letter20590
 
1.0%
Decimal Number10295
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a120041
 
7.6%
r112921
 
7.1%
t110130
 
7.0%
q98491
 
6.2%
d82776
 
5.2%
c72871
 
4.6%
l71620
 
4.5%
e68716
 
4.3%
w59080
 
3.7%
g55936
 
3.5%
Other values (16)731462
46.2%
Uppercase Letter
ValueCountFrequency (%)
G10295
50.0%
T10295
50.0%
Other Punctuation
ValueCountFrequency (%)
.350742
100.0%
Decimal Number
ValueCountFrequency (%)
110295
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1604634
81.6%
Common361037
 
18.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a120041
 
7.5%
r112921
 
7.0%
t110130
 
6.9%
q98491
 
6.1%
d82776
 
5.2%
c72871
 
4.5%
l71620
 
4.5%
e68716
 
4.3%
w59080
 
3.7%
g55936
 
3.5%
Other values (18)752052
46.9%
Common
ValueCountFrequency (%)
.350742
97.1%
110295
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1965671
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.350742
17.8%
a120041
 
6.1%
r112921
 
5.7%
t110130
 
5.6%
q98491
 
5.0%
d82776
 
4.2%
c72871
 
3.7%
l71620
 
3.6%
e68716
 
3.5%
w59080
 
3.0%
Other values (20)818283
41.6%

Interactions

[Figure: pairwise interaction plots between the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
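
The calculation described above can be sketched in plain Python. This is only an illustration, not the code the report generator runs: the helper names `average_ranks` and `spearman_rho` are made up for this example, and ties are assumed to share the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry)) / n
    std_rx = sqrt(sum((a - mean_rx) ** 2 for a in rx) / n)
    std_ry = sqrt(sum((b - mean_ry) ** 2 for b in ry) / n)
    return cov / (std_rx * std_ry)


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # nonlinear, but strictly increasing
rho = spearman_rho(x, y)  # very close to +1: the ranks agree exactly
```

Because only the ranks enter the formula, any strictly increasing transformation of X or Y leaves ρ unchanged.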

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
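
As a minimal sketch of this formula, assuming population (divide-by-n) covariance and standard deviations, the function below computes r and checks the location and scale invariance mentioned above. The name `pearson_r` and the sample values are illustrative and do not come from the dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly linear in x
r = pearson_r(x, y)

# Shifting and rescaling each variable separately (with positive scale
# factors) leaves r unchanged up to floating-point rounding.
r_rescaled = pearson_r([100 * a - 7 for a in x], [0.5 * b + 3 for b in y])
```

The divide-by-n factors cancel between numerator and denominator, so using sample (n - 1) estimates instead would give the same r.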

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
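
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition with a hypothetical helper `kendall_tau_a`. Library implementations often apply a tie correction (tau-b), so their results may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables order the pair the same way
        elif direction < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # direction == 0 means a tie in x or y; it counts toward neither total
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


tau = kendall_tau_a([1, 2, 3], [1, 3, 2])  # 2 concordant, 1 discordant, 3 pairs
```

This direct version compares every pair, so it runs in quadratic time and is only practical for small samples.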

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.

Missing values

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

df_index  Unnamed: 0  filename  win_count  sha256  imp_hash  sec_chi2  sec_entropy  sec_md5  raw_size  virtual_size  virtual_address  sec_name
08905238905232022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca65485.407.82072b707ef80f7c15338cf1fd7c7212aa1638401616944096.text
18905248905242022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca957362.250.489f46fc5a3fcb244f69598b159edfecd840963913167936.rdata
28905258905252022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca498667.882.78a24d116ab001c6148d20035da252901440961787172032.crt1
38905268905262022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca273516.754.04ca2ef02cb2d2c48858ceb3137e44019d40963264176128.rdata
48905278905272022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca4015190.256.67218e10d0dff11714c4062e870cc733ae270336269833180224.data
58905288905282022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca940514.500.54bac11b3e0f44cd6f7fdbe64b8961f1d440962886450560.pdata
68905298905292022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca37923.037.87f096c568fbd7d92957f53fd1a92776c1192512190882454656qwTG
78905308905302022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca2810188.004.090e0a578022c8fef3fb155e5713e8195c6553664664647168.rsrc
88905318905312022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca134540.955.14b60eaab7c709450be3cba1c56615936c1228810428712704.reloc
98905328905322022041900/2022041900_10169033421a0e5ff1a6ef6b4a1c79a5fecd9555ef2e0376177ab8e50b22aae1fefd78c425c7ac00c91884fd2923a489ae9dfbca73113600.000.004579108cda3cebc6432027a86e7b7a9b286720282996724992.lqen

Last rows

df_index  Unnamed: 0  filename  win_count  sha256  imp_hash  sec_chi2  sec_entropy  sec_md5  raw_size  virtual_size  virtual_address  sec_name
361027567474356747432022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca1044480.000.00620f0b67a91f7f74151bc5be745b711040963181163264.ures
361028567474456747442022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca1044480.000.00620f0b67a91f7f74151bc5be745b7110409630381167360.ycwyx
361029567474556747452022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca1044480.000.00620f0b67a91f7f74151bc5be745b7110409630381171456.klsrvp
361030567474656747462022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca2088960.000.000829f71740aab1ab98b33eae21dee122819243881175552.bjcea
361031567474756747472022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca1044480.000.00620f0b67a91f7f74151bc5be745b711040965711183744.omsph
361032567474856747482022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca73113600.000.004579108cda3cebc6432027a86e7b7a9b2867202830741187840.lts
361033567474956747492022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca1044480.000.00620f0b67a91f7f74151bc5be745b711040967301474560.ceki
361034567475056747502022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca73113600.000.004579108cda3cebc6432027a86e7b7a9b2867202830741478656.mub
361035567475156747512022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca1044480.000.00620f0b67a91f7f74151bc5be745b711040964311765376.xer
361036567475256747522022042101/2022042101_4561525312f8bc0299b6f37f90cfd52d97a13ba0a132049ff52f4ec58d8a47b89b2c063925c7ac00c91884fd2923a489ae9dfbca534986.252.473bdc7903f23377c192129423e06221b9409613951769472.lgp